R is a script-based language. You write down a list of instructions and R follows them, performing one action after another. This is different to ‘point and click’ software like Microsoft Excel, and it can feel a bit cumbersome.
In Excel, you can perform a series of steps:
R is free, open-source and powerful. In the past five years, it has also become easier to use and get started with.
…
(screen grabs)
Installing a package is like installing an app on your phone. It….
You can install a package using the install.packages function. Note that there will be lots of text that appears o
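To make the install step concrete, here is a minimal sketch. It assumes the package being installed is the tidyverse (the one loaded later in this tutorial) and that a CRAN mirror is reachable; the `requireNamespace` check and the `repos` argument are additions for illustration, not part of the original instructions.

```r
# Install once per computer; this downloads from CRAN, so it needs internet.
# requireNamespace() returns FALSE when the package is not installed yet,
# so the download is skipped on machines that already have it.
if (!requireNamespace("tidyverse", quietly = TRUE)) {
  # repos names a CRAN mirror explicitly, which non-interactive scripts need.
  install.packages("tidyverse", repos = "https://cloud.r-project.org")
}
```

Typing `install.packages("tidyverse")` on its own in the console does the same job; the check just avoids repeating a slow download.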
Now we need to load it using the library function; like opening an app you have installed on your phone. We do this every time (every ‘session’) we want to use it.
Notice below that we received some messages and warnings when we loaded the tidyverse.
library(tidyverse)
## ── Attaching packages ──────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.1.0     ✔ purrr   0.2.5
## ✔ tibble  2.0.1     ✔ dplyr   0.7.8
## ✔ tidyr   0.8.2     ✔ stringr 1.4.0
## ✔ readr   1.3.1     ✔ forcats 0.3.0
## Warning: package 'tibble' was built under R version 3.5.2
## Warning: package 'stringr' was built under R version 3.5.2
## ── Conflicts ─────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
To read in the data, we use the read_csv function. Here, we’re only going to give it one argument: the path, in quotation marks, to the csv file you want to read.
Tip: open quotation marks and hit tab to choose your file (and save you some typing).
## Parsed with column specification:
## cols(
## country = col_character(),
## continent = col_character(),
## year = col_double(),
## lifeExp = col_double(),
## pop = col_double(),
## gdpPercap = col_double()
## )
## # A tibble: 1,704 x 6
##    country     continent  year lifeExp      pop gdpPercap
##    <chr>       <chr>     <dbl>   <dbl>    <dbl>     <dbl>
##  1 Afghanistan Asia       1952    28.8  8425333      779.
##  2 Afghanistan Asia       1957    30.3  9240934      821.
##  3 Afghanistan Asia       1962    32.0 10267083      853.
##  4 Afghanistan Asia       1967    34.0 11537966      836.
##  5 Afghanistan Asia       1972    36.1 13079460      740.
##  6 Afghanistan Asia       1977    38.4 14880372      786.
##  7 Afghanistan Asia       1982    39.9 12881816      978.
##  8 Afghanistan Asia       1987    40.8 13867957      852.
##  9 Afghanistan Asia       1992    41.7 16317921      649.
## 10 Afghanistan Asia       1997    41.8 22227415      635.
## # … with 1,694 more rows
Looks good! But it isn’t in our Environment (on the right) yet because we didn’t assign it to anything.
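The difference between reading a file and assigning it can be sketched as follows. So that the example runs anywhere, it writes a small stand-in csv to a temporary file (three rows copied from the printed output, with the values as displayed, so rounded); `csv_path` and `gapminder_demo` are hypothetical names used only here. In your own script you would put the path to your csv file in quotation marks and pick whatever name you like.

```r
library(readr)

# Stand-in for the real file: a temporary csv with three rows
# taken from the printed output (rounded values).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "country,continent,year,lifeExp,pop,gdpPercap",
  "Afghanistan,Asia,1952,28.8,8425333,779",
  "Afghanistan,Asia,1957,30.3,9240934,821",
  "Afghanistan,Asia,1962,32.0,10267083,853"
), csv_path)

# Without assignment: the data is read, printed to the console, then discarded.
read_csv(csv_path)

# With assignment (<-): the data is stored under a name, so it shows up
# in the Environment pane and can be reused in later steps.
gapminder_demo <- read_csv(csv_path)
```

The assigned version prints the column specification but not the data itself, because the result went into the object rather than to the console.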
## Parsed with column specification:
## cols(
## country = col_character(),
## continent = col_character(),
## year = col_double(),
## lifeExp = col_double(),
## pop = col_double(),
## gdpPercap = col_double()
## )
Now it is in our Global Environment over there —> wooh!
Much like Excel, we can explore the gapminder dataset with our eyes.
View will open up a new tab that displays your dataset. You can scroll through it.
head will print just the first few observations. This is handy to check on things as you’re going along.
## # A tibble: 6 x 6
##   country     continent  year lifeExp      pop gdpPercap
##   <chr>       <chr>     <dbl>   <dbl>    <dbl>     <dbl>
## 1 Afghanistan Asia       1952    28.8  8425333      779.
## 2 Afghanistan Asia       1957    30.3  9240934      821.
## 3 Afghanistan Asia       1962    32.0 10267083      853.
## 4 Afghanistan Asia       1967    34.0 11537966      836.
## 5 Afghanistan Asia       1972    36.1 13079460      740.
## 6 Afghanistan Asia       1977    38.4 14880372      786.
names will display the names of all variables in the dataset (and is often the answer to ‘what was that variable called again…’).
## [1] "country" "continent" "year" "lifeExp" "pop" "gdpPercap"
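These three functions can be tried on any data frame. A minimal sketch, using a hypothetical three-row stand-in called `gap_small` (values copied, as displayed, from the output above) so that it runs without the csv file:

```r
library(tibble)

# Hypothetical stand-in for the gapminder data: three rows, rounded values.
gap_small <- tribble(
  ~country,      ~continent, ~year, ~lifeExp, ~pop,     ~gdpPercap,
  "Afghanistan", "Asia",      1952,  28.8,     8425333,  779,
  "Afghanistan", "Asia",      1957,  30.3,     9240934,  821,
  "Afghanistan", "Asia",      1962,  32.0,    10267083,  853
)

# View() opens a spreadsheet-style tab, so it only makes sense in an
# interactive session such as RStudio.
if (interactive()) View(gap_small)

# head() prints the first rows; n defaults to 6, here we ask for 2.
head(gap_small, n = 2)

# names() returns the column names as a character vector.
names(gap_small)
```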
Now close your eyes and picture the gapminder dataset:
* Add a new column to the right with the name ‘my_column’.
* Only keep rows from 2007.
* Then remove the ‘year’ column.
gapminder07 <- gapminder %>%           # Assign gapminder07 to: the gapminder dataset, then
  mutate(gdp = gdpPercap * pop) %>%    # create a new column called gdp, then
  filter(year == 2007) %>%             # keep only observations from 2007, then
  select(-gdpPercap)                   # drop the gdpPercap variable (negative select)
We want to make our plots as clear as possible…
##
## Attaching package: 'scales'
## The following object is masked from 'package:purrr':
##
## discard
## The following object is masked from 'package:readr':
##
## col_factor
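The messages above appear when the scales package is attached. The plots that follow pass its comma function to the axis scale, presumably to keep large GDP values readable. As a small sketch of what comma does on its own (the numbers are arbitrary examples, not taken from the dataset):

```r
library(scales)

# comma() turns numbers into text with thousands separators,
# instead of a long run of digits or scientific notation.
comma(1234567)            # "1,234,567"
comma(c(1000, 2500000))   # "1,000" "2,500,000"
```

Passing the function itself as `labels = comma` asks ggplot2 to format every axis label this way.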
# with a log scale
gapminder07 %>%
  ggplot(aes(x = lifeExp,
             y = gdp)) +
  geom_point() +
  scale_y_log10(labels = comma)
# with colour
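gapminder07 %>%
  ggplot(aes(x = lifeExp,
             y = gdp,
             colour = continent)) +
  geom_point() +
  # gapminder07 has one row per country (2007 only), so geom_line has
  # nothing to connect; that is what the message below is telling us
  geom_line(aes(group = country)) +
  scale_y_log10(labels = comma)
## geom_path: Each group consists of only one observation. Do you need to
## adjust the group aesthetic?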
# with colour and facet
gapminder07 %>%
  ggplot(aes(x = lifeExp,
             y = gdp,
             colour = continent)) +
  geom_point() +
  geom_line(aes(group = country)) +
  scale_y_log10(labels = comma) +
  facet_wrap(~ continent)
## geom_path: Each group consists of only one observation. Do you need to
## adjust the group aesthetic?
## geom_path: Each group consists of only one observation. Do you need to
## adjust the group aesthetic?
## geom_path: Each group consists of only one observation. Do you need to
## adjust the group aesthetic?
## geom_path: Each group consists of only one observation. Do you need to
## adjust the group aesthetic?
## geom_path: Each group consists of only one observation. Do you need to
## adjust the group aesthetic?
# with colour, size and a facet grid
gapminder07 %>%
  mutate(decade = signif(year, 3)) %>%
  ggplot(aes(x = lifeExp,
             y = gdp,
             colour = continent,
             size = pop)) +
  geom_point() +
  geom_line(aes(group = country)) +
  scale_y_log10(labels = comma) +
  facet_grid(decade ~ continent)
## geom_path: Each group consists of only one observation. Do you need to
## adjust the group aesthetic?
## geom_path: Each group consists of only one observation. Do you need to
## adjust the group aesthetic?
## geom_path: Each group consists of only one observation. Do you need to
## adjust the group aesthetic?
## geom_path: Each group consists of only one observation. Do you need to
## adjust the group aesthetic?
## geom_path: Each group consists of only one observation. Do you need to
## adjust the group aesthetic?
# with colour, transparency and facet
gapminder07 %>%
  ggplot(aes(x = lifeExp,
             y = gdp,
             colour = continent)) +
  geom_point(alpha = 0.5) +
  geom_line(aes(group = country)) +
  scale_y_log10() +
  facet_wrap(~ continent)
## geom_path: Each group consists of only one observation. Do you need to
## adjust the group aesthetic?
## geom_path: Each group consists of only one observation. Do you need to
## adjust the group aesthetic?
## geom_path: Each group consists of only one observation. Do you need to
## adjust the group aesthetic?
## geom_path: Each group consists of only one observation. Do you need to
## adjust the group aesthetic?
## geom_path: Each group consists of only one observation. Do you need to
## adjust the group aesthetic?
At some point throughout your university life you will need to write equations in a document.
$A = (r^{4}) / $
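As a sketch of the syntax, assuming the document is written in R Markdown: an equation between single dollar signs is set inline with the text, and one between double dollar signs is displayed on its own line. The formula used here is the area of a circle, chosen only as a simple illustration; it is not the equation from this tutorial.

```latex
Inline: the area of a circle is $A = \pi r^{2}$.

Displayed on its own line:
$$A = \pi r^{2}$$
```

Here `\pi` produces the Greek letter and `r^{2}` puts the 2 in superscript.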